Handling Outlandish Occurrences: Using Rules and Lexicons for Correcting NLP Articles
Abstract
This article describes the experiments we performed during our participation in the HOO Challenge. We present the adaptations we made to two systems, mainly designing new grammatical rules and extending a lexicon. We focused on some of the most common errors in the corpus: missing punctuation and inaccurate prepositions. Our best experiment achieved a detection score of 0.1097, a recognition score of 0.0820, and a correction score of 0.0557 on the test corpus.
Similar resources
Mining Association Rules from Clinical Narratives
We propose a method that processes raw informal medical texts (from health forums) and formal texts (outpatient records) in Bulgarian in order to extract typical word co-occurrences in the form of association rules. When mining these rules we use some context information and small terminological lexicons to generalize the extracted frequent patterns. This allows to study informal expre...
Joint Word Representation Learning Using a Corpus and a Semantic Lexicon
Methods for learning word representations using large text corpora have received much attention lately due to their impressive performance in numerous natural language processing (NLP) tasks such as semantic similarity measurement and word analogy detection. Despite their success, these data-driven word representation learning methods do not consider the rich semantic relational structure betw...
Multilingual Computational Semantic Lexicons in Action: The WYSINNWYG Approach to NLP
Much effort has been put into computational lexicons over the years, and most systems give much room to (lexical) semantic data. However, in these systems, the effort put on the study and representation of lexical items to express the underlying continuum existing in 1) language vagueness and polysemy, and 2) language gaps and mismatches, has remained embryonic. A sense enumeration approach fai...
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A great deal of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Annotating Events, Time and Place Expressions in Arabic Texts
We present in this paper an unsupervised approach to recognizing events, time and place expressions in Arabic texts. Arabic is a resource-scarce language, and annotated corpora, lexicons and other necessary NLP tools are not readily available. We show in this work that we can recognize events, time and place expressions in Arabic texts without using a POS-annotated corpus and without a lexicon. We ...
Publication date: 2011